Db2 Database Partitioning vs Range Partitioning

您所在的位置：网站首页 › db2 partition by range › Db2 Database Partitioning vs Range Partitioning

Db2 Database Partitioning vs Range Partitioning

#Db2 Database Partitioning vs Range Partitioning| 来源: 网络整理| 查看: 265

#Db2 has the ability to create a single database that runs in different physical servers in a MPP fashion. This feature was called DPF (Database Partitioning Feature), now it is called Db2 MPP so people can understand better what it means: It allows to have different physical servers running a single Db2 database.

A Db2 database partition is a complete Db2 engine with its own logs, cache (bufferpool), etc. but from the user perspective we have only one databaseand the queries will run in parallel in all the database partitions. Each partition will own a subset of all the rows of the table. This Architecture is called Shared Nothing because each Db2 partition does not shared anything with the rest of the Db2 partitions. Not even the disk as we can set this configuration, using local disks (but if you want HA you will need to put the data in an external storage, so when one machine fails other can take over its data).

So when you run a query it will be run, in parallel, against each database partition. For example, if you have 4 partitions and you run a query, Db2 internally will run 4 queries in parallel. Every query will be solved in parallel against its own data. This technique in Computer Science is called “Divide and Conquer” where we divide a problem (in this case a big volume of data) in smaller volumes to attack them in parallel. At the end we are achieving speed. At not only for queries, also backups, reorgs, loads, etc. are run in parallel.

This kind of parallelism is called inter-parallelism as is obtained thanks to have multiple database partitions. Besides Db2 does many thing in parallel inside each database partition or when you don’t use MPP (then it is like having 1 database partition).

With MPP the data is partition using a hash algorithm that will split the rows in a way that every partition has approximately the same number of rows. Hash partitions is very different from partition by range (e.g. by date). Hash partitioning allows that, for example, when you run a query against January all of the partitions will run and not only the one containing January (as in range partitioning) because January, thanks to the hashing algorithm, will be present in every partition.

This kind of partitioning can also be used together with range partition. If you do this each database partition will be divided by the range you choose. If your range, for example, is the month, January will be present in every database partition and when you run a query against January every database partition will run it, but this time only accessing the January rows. This way Db2 maintains the query parallelism but this time against a even smaller portion of data (January rows). So both Database partition and Range Partition can be used together.

Now in Db2 v11 you can deploy your Db2 BLU database using MPP. Before V11 BLU was limited to one machine and now it can scale horizontally, allowing huge databases.

【本文地址】

Db2 Database Partitioning vs Range Partitioning

Db2 Database Partitioning vs Range Partitioning

今日新闻

推荐新闻